Physiologically Motivated Audio-Visual Localisation and Tracking

Author

  • Stuart N. Wrigley
Abstract

An audio-visual localisation and tracking system for meeting scenarios is presented which draws its inspiration from neurobiological processing. Meetings are recorded by a KEMAR binaural manikin and a single camera placed directly above the manikin. Source localisation from the binaural audio and face, object and motion locations from the video frames are used as input to two linked neural oscillator networks. The strength of the connections between the two networks determines the mapping between activity at a particular audio azimuth and activity at a particular visual frame column. A Hebbian learning rule is used to establish the connection strengths. The combined network segments the video and audio features and then produces audio-visual groupings on the basis of common spatial location. The audio-visual groupings are tracked through time using a mechanism based upon that of the human oculomotor system which incorporates smooth pursuit and saccadic movement.
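The abstract does not give the form of the Hebbian rule, so the following is only a minimal sketch of the general idea: connection strengths between audio azimuth units and visual frame columns grow when the two are active together. The Gaussian activity profiles, the row normalisation, the linear azimuth-to-column relationship used to generate training events, and all function names are assumptions made for illustration; none of them is taken from the paper.

```python
import math


def gaussian_bump(center, size, width):
    """Population activity over `size` units, peaking at index `center`."""
    return [math.exp(-((i - center) ** 2) / (2 * width ** 2)) for i in range(size)]


def hebbian_update(weights, audio, visual, rate=0.1):
    """Plain Hebbian rule: w[a][v] += rate * audio[a] * visual[v]."""
    for a, xa in enumerate(audio):
        for v, xv in enumerate(visual):
            weights[a][v] += rate * xa * xv


def normalise_rows(weights):
    """Scale each row to unit sum so weights stay bounded (an assumption)."""
    for row in weights:
        total = sum(row)
        if total > 0:
            for v in range(len(row)):
                row[v] /= total


def train_mapping(n_azimuths=9, n_columns=17, epochs=20):
    """Learn azimuth-to-column weights from co-occurring audio-visual events."""
    weights = [[0.0] * n_columns for _ in range(n_azimuths)]
    for _ in range(epochs):
        for a in range(n_azimuths):
            # Hypothetical training event: a source at azimuth index `a`
            # is seen at column 2 * a (a made-up linear camera geometry).
            audio = gaussian_bump(a, n_azimuths, width=0.5)
            visual = gaussian_bump(2 * a, n_columns, width=1.0)
            hebbian_update(weights, audio, visual)
        normalise_rows(weights)
    return weights


def predicted_column(weights, azimuth_index):
    """Column with the strongest learned connection to the given azimuth."""
    row = weights[azimuth_index]
    return max(range(len(row)), key=row.__getitem__)


if __name__ == "__main__":
    w = train_mapping()
    for a in (0, 4, 8):
        print("azimuth index", a, "-> column", predicted_column(w, a))
```

After training, reading off the strongest connection for an azimuth unit gives the frame column most often active at the same time, which is the kind of mapping the abstract attributes to the connection strengths between the two networks.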

Similar resources

A Physiologically Inspired Method for Audio Classification

We explore the use of physiologically inspired auditory features with both physiologically motivated and statistical audio classification methods. We use features derived from a biophysically defensible model of the early auditory system for audio classification using a neural network classifier. We also use a Gaussian-mixture-model (GMM)-based classifier for the purpose of comparison and show ...

Potential relevance of audio-visua computational

The purpose of this study was to examine typically developing infants’ integration of audio-visual sensory information as a fundamental process involved in early word learning. One hundred sixty pre-linguistic children were randomly assigned to watch one of four counterbalanced versions of audio-visual video sequences. The infants’ eye-movements were recorded and their looking behavior was anal...

Physiologically motivated audio-visual localisation and tracking

An audio-visual localisation and tracking system for meeting scenarios is presented which draws its inspiration from neurobiological processing. Meetings are recorded by a KEMAR binaural manikin and a single camera placed directly above the manikin. Source localisation from the binaural audio and face, object and motion locations from the video frames are used as input to two linked neural osci...

Real-time Synthesis of Chinese Visua using MPEG-4 FAP Features in a

This paper describes our initial work in developing a real-time audio-visual Chinese speech synthesizer with a 3D expressive avatar. The avatar model is parameterized according to the MPEG-4 facial animation standard [1]. This standard offers a compact set of facial animation parameters (FAPs) and feature points (FPs) to enable realization of 20 Chinese visemes and 7 facial expressions (i.e. 27...

Physiologically-motivated synchrony-based processing for robust automatic speech recognition

This paper describes the structure and performance of a new signal processing scheme, motivated by the physiology of the peripheral auditory system, that improves speech recognition accuracy in the presence of broadband noise. An important attribute of the peripheral processing is a novel mechanism to represent the cycle-by-cycle synchrony in the response of low-frequency auditory-nerve fibers,...

Publication date: 2005